The complete chloroplast genome and phylogenetic analysis of Jacobaea maritima (Asteraceae)

Abstract Jacobaea maritima is an important horticulture plant in the genus Jacobaea. Here, we assembled the complete chloroplast genome of J. maritima. The chloroplast genome was 153,857 bp in length, with a pair of inverted repeat regions (IRs) (27,936 bp) separated by a large single-copy region (LSC) (82,771 bp) and a small single-copy region (SSC) (15,214 bp). The complete chloroplast genome contained 112 unique genes, including 79 protein-coding genes, 29 tRNA genes, and 4 rRNA genes. The phylogenetic analysis showed Jacobaea was more closely related to Senecio, Crassocephalum and Gynura. The chloroplast genome of J. maritima can provide data to support future phylogenetic studies of Jacobaea.

Jacobaea maritima; chloroplast genome; horticulture plants; phylogeny Background Jacobaea maritima (L.) Pelser & Meijden belongs to the tribe Senecioneae branch in the family Asteraceae, and prefers to grow in cliffs and rocky coastal areas, usually distributed in the Mediterranean region, northwest Africa, southern Europe and western Asia, while in China it is mainly used as an ornamental plant (Galasso and Bartolucci 2015;Passalacqua et al. 2008). The Jacobaea was rich in secondary metabolites, including flavonoids, pyrrolizidine alkaloids, triterpenoids and aromatic acids (Maggio et al. 2015). Therefore, Jacobae species had become important for its ornamental and medicinal value and were grown in most areas with great climatic adaptability to different regions.
The phylogenetic study of Jacobaea is mainly based on plastid fragments and nuclear genes ( Figure S1). In 2002, the molecular phylogeny of Jacobaea was studied to clarify species composition and interspecific relationships of Jacobaea (Pelser et al. 2002). A resolution of 60 representative species of the tribe Senecioneae based on DNA sequence data from the plastid sequences (trnT-trnL, trnL intron, trnK intron, and matK gene) and nuclear sequences (ITS1, 5.8S, and ITS2) revealed that Jacobaea was a strongly supported monophyletic group. The genera Emilia, Packera and Pseudogynoxys formed the sister clade of Jacobaea, but this relationship lacked strong bootstrap support. In 2003, the species composition and molecular phylogeny of Jacobaea were studied to identify the closest relatives of J. vulgaris (Pelser et al. 2003). Maximum parsimony and Bayesian inference analyses of DNA sequence data of the plastid (trnT-L, trnL intron, trnK intron, and psbA-trnH) and nuclear genome (ITS1, 5.8S, and ITS2) showed these markers to be suitable to assess the species composition of Jacobaea. In 2008, researchers performed phylogenetic analysis of the J. maritima group based on ITS1 sequence (Passalacqua et al. 2008). In 2021, a systematic study of diversity, polyploidy and morphology in the family Asteraceae based on single-copy nuclear genes, showed that the tribe Senecioneae was a monophyletic group and Jacobaea had a close relationship with Senecio and Emilia (Zhang et al. 2021).
However, studies on the chloroplast genome of J. maritima are still lacking, and phylogenetic questions regarding Jacobaea have not been addressed. In this study, we sequenced and assembled the chloroplast genome of J. maritima and performed comparative genomic analyses with the intention of further revealing the chloroplast genome characteristics of Jacobaea species. These results will provide a data base for phylogenetic studies of the tribe Senecioneae and Jacobaea species.

Plant material, DNA extraction and sequencing
Fresh leaves of J. maritima were sampled from second-year old plants in Xinyang, Henan, China (Figure 1;114.21667 E, 32.28333 N). The voucher specimen was deposited in the herbarium of Xinyang Agriculture and Forestry University (voucher number: XJM22004; Mr. Gong, 2000230016@xyafu.edu.cn). The total genomic DNA was extracted using the HiPure Plant DNA Mini Kit (Magen, Guangzhou, China). A DNA library with an average length of 500 bp was constructed according to the instructions of the TruSeq DNA PCR-Free LT Sample Prep kit, followed by high-throughput sequencing (paired-end 250 bp) using the Illumina NovaSeq6000 platform. Raw reads were filtered using Trimmomatic v. 0.39 with default parameters retrieving the clean data.

Phylogenetic analysis
To confirm the phylogenetic position of J. maritima, the chloroplast genomes of 20 Asteraceae species and one Brassicaceae species were downloaded from GenBank. The Arabidopsis thaliana was used as outgroup. We extracted 75 common proteincoding genes from the genome annotation files using PhyloSuite v. 1.2.2 (Zhang et al. 2020). Each protein-coding gene sequence was aligned using MAFFT v. 7.4 (Katoh and Standley 2013), and the 75 aligned sequences were concatenated. Based on the matrix of concatenate sequence, a phylogenetic tree was constructed using the maximum likelihood (ML) method implemented in IQ-TREE v. 2.1.2 (Nguyen et al. 2015), and the best model was inferred from ModleFinder (Kalyaanamoorthy et al. 2017

General features of the chloroplast genome
We examined the chloroplast genome for completeness, coverage depth and collinearity, and the results showed that  the chloroplast genome was reliable ( Figure S2). Analysis of the genome annotation results also showed that the cis-and trans-splicing gene annotations were correct ( Figure S3). The complete chloroplast genome of J. maritima was a typical circular tetrameric structure of 153,857 bp in length ( Figure 2;  (Table S1). The chloroplast genome had a total GC content of 37.13% and the IR region had a GC content (41.37%) was significantly higher than that of the LSC region (35.43%) and the SSC region (31.81%). In the study, we detected 128 SSRs in the chloroplast genome of J. maritima, including 34 mononucleotides, 41 dinucleotides, 19 trinucleotides, 24 tetranucleotides, 7 pentanucleotides and 3 hexanucleotides (Table   1). Most SSRs were mononucleotides and dinucleotides, accounting for 58.59% of the total. In the chloroplast genomes of J. maritima and J. vulgaris, SSRs were most abundant in the LSC region and least in the IR region, and mainly concentrated in the coding region, with fewer SSRs in the non-coding region (Table 1; Table S2). This phenomenon may be related to the sequence conservativeness and GC content of different regions, etc. In addition, we detected 4 interspersed repeats in the chloroplast genome of J. maritima, including one forward repeat and 3 palindromic repeats.

IR boundaries analysis
In general, the length variation of the LSC/SSC region was lower than that of the IRa/IRb region. The chloroplast genome of J. maritima showed the expansion of the IR region and contraction of the SSC region compared to the chloroplast genome of J. vulgaris ( Figure S4). In the chloroplast genome of J. maritima, rps15 and rpl32 were located within the SSC/IR boundary, and ndhF and rpl2 were located outside the SSC/IR boundary. In contrast, in the chloroplast genome of J. vulgaris, ycf1 was located within the SSC/IR boundary, ndhF was located within the SSC/IR boundary, and rpl2 and trnN were located outside the SSC/IR boundary.

Phylogenetic analysis
To understand the phylogenetic position of J. maritima in the Asteraceae, we performed a phylogenetic analysis. The maximum-likelihood tree was constructed using IQ-TREE v. 2.1.2 (Nguyen et al. 2015) with the best-fit model of TVM þ FþR4.
The phylogenetic analysis showed high bootstrap for most of nodes in the phylogenetic tree, showing the reliability of the phylogeny (Figure 3). Our results clearly showed that Jacobaea was more closely related to Senecio, Crassocephalum and Gynura.

Discussion
Compared with nuclear and mitochondrial genomes, chloroplast genomes are highly conserved and have been widely used in phylogenetic and evolutionary studies. With the development of high-throughput sequencing technology, chloroplast genome sequences play an important role in phylogenetic study (Li et al. 2021). In this study, the chloroplast protein-coding genes tree showed high support with credibility. Our phylogenetic results were also in general agreement with the previous results ( Figure S1) (Passalacqua et al. 2008;Zhang et al. 2021), which showed that Jacobaea was more closely related to Senecio, Crassocephalum and Gynura. Future enrichment of chloroplast genomic data of Jacobaea species is needed to further resolve intra-generic phylogenetic relationships. However, these results are limited due to the matrilineal inheritance of the chloroplast genome (Krawczyk et al. 2018). An accurate phylogenetic relationship still requires an integrated analysis of the nuclear and organelle genomes (G orniak et al. 2010). More molecular data is needed in the future to determine the relationships between Jacobaea and other genus in the family Asteraceae.

Conclusion
In this study, the chloroplast genome of J. maritima was de novo assembled with short reads. This chloroplast genome had a typical tetrameric structure similar to that of most angiosperms. In addition, the phylogenetic tree strongly supported the phylogenetic position of J. maritima, showing that Jacobaea was more closely related to Senecio, Crassocephalum and Gynura. Thus, the chloroplast genome of J. maritima not only enriches the genomic information of Jacobaea, but also lays the foundation for understanding the evolution of Asteraceae species.

Ethical approval
The author has read the manuscript and has approved this submission. Ethical statement is not applicable. The research on plants used in this study, including the collection of plant material has been carried out in accordance with guidelines provided by our institution. Field studies in our manuscript have complied with local legislation and appropriate permissions/license were granted while taking samples from a preserved/protected land.

Author contributions
Shoufu Gong directed the study and designed the experiments. Kai Zhang and Shoufu Gong performed the data processing. Kai Zhang drafted the manuscript. All authors approved the final draft.

Disclosure statement
No potential conflict of interest was reported by the authors.

Data availability statement
The contact person of the specimen is Shoufu Gong (2000230016@xyafu. edu.cn). The genome sequence data that support the findings of this study are openly available in GenBank of NCBI at (https://www.ncbi.nlm. nih.gov/) under the accession no. OL960706. The associated BioProject, SRA, and Bio-Sample numbers are PRJNA775801, SRX12813512, and SAMN22625029 respectively.